ConFarm: Extracting Surface Representations of Verb and Noun Constructions from Dependency Annotated Corpora of Russian

نویسنده

  • Nikita Mediankin
چکیده

ConFarm is a web service dedicated to extraction of surface representations of verb and noun constructions from dependency annotated corpora of Russian texts. Currently, the extraction of constructions with a specific lemma from SynTagRus and Russian National Corpus is available. The system provides flexible interface that allows users to finetune the output. Extracted constructions are grouped by their contents to allow for compact representation, and the groups are visualised as a graph in order to help navigating the extraction results. ConFarm differs from similar existing tools for Russian language in that it offers full constructions, as opposed to extracting separate dependents of search word or working with collocations, and allows users to discover unexpected constructions as opposed to searching for examples of a user-defined construction.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dependency Parsing for Identifying Hungarian Light Verb Constructions

Light verb constructions (LVCs) are verb and noun combinations in which the verb has lost its meaning to some degree and the noun is used in one of its original senses. They often share their syntactic pattern with other constructions (e.g. verbobject pairs) thus LVC detection can be viewed as classifying certain syntactic patterns as light verb constructions or not. In this paper, we explore a...

متن کامل

Baldwin, Timothy (2005) The Deep Lexical Acquisition of English Verb-particle Constructions, Computer Speech and Language, Special Issue on Multiword Expressions, Volume 19, Issue 4, pp. 398-414

This paper proposes a range of techniques for extracting English verb–particle constructions from raw text corpora, complete with valence information. We propose four basic methods, based on the output of a POS tagger, chunker, chunk grammar and dependency parser, respectively. We then present a combined classifier which we show to consolidate the strengths of the component methods.

متن کامل

Nouns as Components of Support Verb Constructions in the Prague Dependency Treebank

Support Verb Constructions (SVCs) are combinations of a noun denoting an event or a state and a lexical verb. From the semantic point of view, the noun seems to be a part of a complex predicate rather than the object (or subject) of the verb, despite what the surface syntax suggests. The meaning is concentrated in the noun component, whereas the semantic content of the verb is reduced or genera...

متن کامل

Non-projectivity and valency

We describe results of investigation of a specific type of discontinuous constructions, namely non-projective constructions concerning verbs and their arguments. This topic is especially important for languages with a relatively free word order, such as Czech, which is the language we have primarily worked with. For comparison, we have included some results for English. The corpora used for bot...

متن کامل

Clustering of Russian Adjective-Noun Constructions using Word Embeddings

This paper presents a method of automatic construction extraction from a large corpus of Russian. The term ‘construction’ here means a multi-word expression in which a variable can be replaced with another word from the same semantic class, for example, a glass of [water/juice/milk]. We deal with constructions that consist of a noun and its adjective modifier. We propose a method of grouping su...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016